Abstract. We present an overview of some recent efforts aimed at the
development of Choreographic Programming, a programming paradigm
for the production of concurrent software that is guaranteed to be correct 
by construction from global descriptions of communication behaviour. 


1 Introduction 

Programming communications among the endpoints in a concurrent system is 
challenging, because it is notoriously difficult to predict how nontrivial programs 
executed simultaneously may interact [19]. To mitigate this issue, choreographies 
can be used to give precise specifications of communication behaviour [28,1]. 

A choreography specifies the expected communications among endpoints 
from a global viewpoint, in contrast with the standard methodology of giving a 
separate specification for each endpoint that defines its Input/Output (I/O)
actions. As an example, consider the following choreographic specification (whose
syntax is derived from the “Alice and Bob” notation from [23]): 

Alice -> Bob : book; Bob -> Alice : money 

The choreography above describes the behaviour of two endpoints, Alice and 
Bob. First, Alice sends to Bob a book; then, Bob replies to Alice with some
money as payment. The motivation for using a choreography as specification is 
that it is always “correct by design”, since it explicitly describes the intended 
communications in a system. In other words, a choreography can be seen as a 
formalisation of the communication flow intended by the system designer. 

A choreography can be compiled to the local specifications of the I/O actions 
that each endpoint should perform [27,18,10], as depicted below: 

                         EPP
Choreography Spec.   ---------->   Endpoint Spec.
(correct by design)                (correct by construction)

In the methodology above, an Endpoint Projection (EPP) procedure is used to
generate the specifications for each endpoint starting from a global choreographic
specification. The endpoint specifications are therefore correct by construction,
because they are computed from a correct-by-design choreography. A major
consequent benefit is that such endpoint specifications are also deadlock-free,
because I/O actions cannot be defined separately in choreographies and are
therefore always paired correctly in the result of EPP.
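The projection step can be illustrated with a minimal Python sketch. It is not the EPP of any of the cited works: it assumes that a choreography is a plain list of (sender, receiver, message) triples, and the names `project`, `send` and `recv` are hypothetical, chosen only for this illustration.

```python
from typing import List, Tuple

# Assumed encoding: one (sender, receiver, message) triple per communication.
Choreography = List[Tuple[str, str, str]]

def project(chor: Choreography, endpoint: str) -> List[Tuple[str, str, str]]:
    """Return the I/O actions of `endpoint`, in choreography order."""
    actions = []
    for sender, receiver, msg in chor:
        if sender == endpoint:
            actions.append(("send", receiver, msg))
        if receiver == endpoint:
            actions.append(("recv", sender, msg))
    return actions

# Alice -> Bob : book; Bob -> Alice : money
alice_bob = [("Alice", "Bob", "book"), ("Bob", "Alice", "money")]
alice_spec = project(alice_bob, "Alice")
bob_spec = project(alice_bob, "Bob")
print(alice_spec)  # [('send', 'Bob', 'book'), ('recv', 'Bob', 'money')]
print(bob_spec)    # [('recv', 'Alice', 'book'), ('send', 'Alice', 'money')]
```

Every send in one projection has a matching receive in another, because both are generated from the same line of the choreography.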

In this paper, we give an overview of some recent results by the author 
and collaborators aimed at applying the choreography-based methodology as a 
fully-fledged programming paradigm, rather than as a specification method. In 
this paradigm, called Choreographic Programming, choreographies are concrete
programs and EPP is a compiler targeting executable distributed code: 

                           EPP (compiler)
Choreography Program   ------------------>   Executable Endpoint Programs
(correct by design)                          (correct by construction)

Ideally, this methodology will allow developers to program systems from a global
viewpoint, which is less error-prone than writing endpoint programs directly, and
then to obtain executable code that is correct by construction.

To kickstart the development of choreographic programming, we are interested
in finding suitable language models (§ 2) and their implementation (§ 3).
We discuss them in the remainder of the paper, following the syntax from [20].


2 Language models 


In [11] we present the Choreography Calculus (CC), a language model for
choreographic programming that follows the correct-by-construction methodology
discussed in § 1 and provides an interpretation of concurrent behaviour in
choreographies. The key first-class elements of CC are processes and sessions,
respectively representing endpoints that execute concurrently and the
conversations among them. The basic statement of choreographic programs, ranged
over by C, is a communication:

p.e -> q.x : k; C 

which reads “process p sends the value of expression e to process q, which
receives it on variable x, over session k; then, the system executes the
continuation choreography C”. We illustrate the model with the following toy
example on a replicated journaling file system.

Example 1 (Replicated Journaling File System, write operation). We define a
choreography, denoted Cjfs, in which a client c uses a session k to send some 
data to be written in a journaling file system replicated on two storage nodes. 


Cjfs =def
    1. c.data -> j1.data1 : k;
    2. c.data -> j2.data2 : k;
    3. j1.blocks(data1) -> s1.blocks1 : k';
    4. j2.blocks(data2) -> s2.blocks2 : k';
    5. j1 -> c : k;
    6. j2 -> c : k


In the choreography Cjfs, the client c uses session k to send the data to be
written to two processes, j1 and j2, which we assume log the operation in their
respective journals upon reception (Lines 1-2). The two journal processes then
use another session, k', to forward the data to be written to their respective
processes handling the actual data storage, s1 and s2 (Lines 3-4). Finally, at the
same time, processes j1 and j2 send an empty message on session k to the client,
in order to inform it that the operation has been completed (Lines 5-6).

Concurrency. Process identifiers (c, j1, j2, s1 and s2 in our example) are key to
formalising concurrent behaviour in CC. Observe Lines 3-4: since processes run
in parallel, the communication between j2 and s2 in Line 4 could be completed
before the communication between j1 and s1 in Line 3. In CC, the semantics
of the sequential operator is thus relaxed by a syntactic swapping congruence
relation ≃_C, which allows two statements to be swapped if they do not share
any processes. For example, the choreography Cjfs would be equivalent to a
choreography C'jfs, written Cjfs ≃_C C'jfs, in which Lines 3 and 4 are exchanged.
In [20], the relation ≃_C is validated by showing that it corresponds to the typical
interleaving semantics of the parallel operator found in process calculi.
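The side condition of the swapping relation can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not the formal definition from CC: statements are assumed to be (sender, receiver, session) triples, and `can_swap` and `swap_adjacent` are hypothetical helper names.

```python
from typing import List, Tuple

# Assumed encoding: (sender, receiver, session) per communication statement.
Stmt = Tuple[str, str, str]

def can_swap(a: Stmt, b: Stmt) -> bool:
    """Two adjacent statements may be swapped iff they share no process."""
    return {a[0], a[1]}.isdisjoint({b[0], b[1]})

def swap_adjacent(chor: List[Stmt], i: int) -> List[Stmt]:
    """Swap statements i and i+1 (0-based) if the side condition holds."""
    if not can_swap(chor[i], chor[i + 1]):
        raise ValueError("statements share a process; their order must be kept")
    return chor[:i] + [chor[i + 1], chor[i]] + chor[i + 2:]

c_jfs = [
    ("c", "j1", "k"), ("c", "j2", "k"),      # Lines 1-2
    ("j1", "s1", "k'"), ("j2", "s2", "k'"),  # Lines 3-4
    ("j1", "c", "k"), ("j2", "c", "k"),      # Lines 5-6
]
c_jfs_swapped = swap_adjacent(c_jfs, 2)  # exchange Lines 3 and 4
print(can_swap(c_jfs[0], c_jfs[1]))      # False: both involve process c
```

Lines 1 and 2 cannot be exchanged under this rule because the client c takes part in both, whereas Lines 3 and 4 involve disjoint pairs of processes.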

Sessions and Typing. The communications in Lines 1-2, 5-6 and the
communications in Lines 3-4 are included in different sessions, respectively k and k'.
Each session represents a logically-separate conversation, as in other
session-based calculi (e.g., [15,4]), and is strongly typed in CC with a typing discipline
that checks for adherence to protocols expressed as multiparty session types [16].
We give an example of how protocols are mapped to choreographies in § 3.

Endpoint Projection. CC comes with an EPP that compiles choreographies to
distributed implementations in terms of the π-calculus [11]. The behaviour of the
generated code follows that of the originating choreography, according to a
small-step operational semantics. As a corollary, the produced code is also
deadlock-free: senders and receivers are always ready to communicate when they
have to, as I/O actions cannot be mismatched in choreographies.
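The deadlock-freedom claim can be observed on a toy model. The Python sketch below is an assumption-laden illustration, not the π-calculus semantics of CC: communications are (sender, receiver) pairs, endpoint programs are lists of `send`/`recv` actions, and a step fires only when a send meets its matching receive. The helper names `project` and `run` are hypothetical.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # ("send", peer) or ("recv", peer)

def project(chor: List[Tuple[str, str]], endpoint: str) -> List[Action]:
    """Keep only the actions of `endpoint`, in choreography order."""
    actions: List[Action] = []
    for sender, receiver in chor:
        if sender == endpoint:
            actions.append(("send", receiver))
        if receiver == endpoint:
            actions.append(("recv", sender))
    return actions

def run(programs: Dict[str, List[Action]]) -> bool:
    """Fire matching send/recv pairs until none is enabled; True iff all finish."""
    pc = {p: 0 for p in programs}  # next action index of each endpoint
    progressed = True
    while progressed:
        progressed = False
        for p, acts in programs.items():
            if pc[p] == len(acts) or acts[pc[p]][0] != "send":
                continue
            q = acts[pc[p]][1]
            if pc[q] < len(programs[q]) and programs[q][pc[q]] == ("recv", p):
                pc[p] += 1
                pc[q] += 1
                progressed = True
    return all(pc[p] == len(programs[p]) for p in programs)

c_jfs = [("c", "j1"), ("c", "j2"), ("j1", "s1"), ("j2", "s2"), ("j1", "c"), ("j2", "c")]
projected = {p: project(c_jfs, p) for p in ["c", "j1", "j2", "s1", "s2"]}
# Hand-written endpoints in which both sides wait to receive first.
handwritten = {"a": [("recv", "b"), ("send", "b")], "b": [("recv", "a"), ("send", "a")]}
print(run(projected))    # True
print(run(handwritten))  # False
```

The hand-written pair gets stuck because neither endpoint ever reaches its send; such a mismatch cannot be written as a choreography in this encoding, since each pair yields one send and one matching receive.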

Modularity. In [22], we extend CC to support the implementation and reuse
of external libraries/services (modular development), using a notion of external
participants in sessions. For example, we can split the choreography Cjfs in two
modules, a client choreography Ccli and a server choreography Csrv:

Csrv =def
    1. C -> j1.data1 : k;
    2. C -> j2.data2 : k;
    3. j1.blocks(data1) -> s1.blocks1 : k';
    4. j2.blocks(data2) -> s2.blocks2 : k';
    5. j1 -> C : k;
    6. j2 -> C : k

The choreographies above refer to each other using references to external
processes, e.g., J1 in Ccli is a reference to process j1 in Csrv. Separate choreography
modules can be compiled and deployed separately, with the guarantee that their
generated implementations will interact with each other as expected.
Fig. 1. Chor, development methodology (from [11]).


program simple;

protocol SimpleProtocol {
    C -> S: hi( string )
}

public a: SimpleProtocol

main
{
    client[C] start server[S] : a( k );
}

Operation wrong is not expected by the type for session k


Fig. 2. Chor, example of error reporting (from [20]). 


Extraction. In [12], we present a proofs-as-programs Curry-Howard correspondence
between Internal Compositional Choreographies (ICC, a simplification of
CC) and a generalisation of Linear Logic [14], inspired by [8]. ICC is a first step
in defining a canonical model for choreographies and formalising logical
reasoning on choreographic programs. In this correspondence, EPP is formalised
as a transformation between logically-equivalent proofs, one corresponding to a
choreography program and the other corresponding to a π-calculus term. The
transformation is invertible, yielding a procedure for automatically extracting
the choreography that a π-calculus term typed with linear logic is following.

3 Implementation 

The Choreography Calculus (CC), along with related work on models for
choreography languages [27,18,10], offers insight on how choreographic programming
can be formally understood as a self-standing paradigm. To practically evaluate
choreographic programming, we developed the Chor programming language^1, an
open source prototype implementation of CC [20].

^1 http://www.chor-lang.org/

In Chor, the correct-by-construction methodology of choreographic programming
is proposed as a concrete software development process, depicted in Fig. 1.
Choreographies are written using an Integrated Development Environment (IDE),
which visualises on-the-fly errors regarding syntax and protocol verification, as
in the screenshot in Fig. 2. Then, a choreography can be projected to executable
code via an implementation of EPP that follows the ideas of CC. In this case,
the target language is Jolie^2 [21]. Once the compiler has generated the Jolie
programs for the endpoints described in the choreography, the developer can
customise their deployments. This is done using the Jolie primitives for
integrating with standard communication protocols and technologies, which do not alter
the behaviour of the code generated by the Chor compiler. The resulting code
can finally be executed using the Jolie interpreter.

In Chor, the syntax from CC is extended with operation names for
communications (as in Web Services [2]) and data manipulation primitives. As an
example, we show an extended implementation of the scenario from Example 1.

protocol Write {
    C -> J1: { write( string );
                   C -> J2: write( string );
                   J1 -> C: ok( void );
                   J2 -> C: ok( void ),
               writeAsync( string );
                   C -> J2: writeAsync( string )
    }
}

protocol Store { J1 -> S1: write( string );
                 J2 -> S2: write( string ) }

define computeBlocks(j1, j2) { /* ... */ }

define write(c, j1, j2, s1, s2)
    (k[ Write: c[C], j1[J1], j2[J2] ],
     k2[ Store: j1[J1], j2[J2], s1[S1], s2[S2] ]) {
    if (sync)@c {
        c.data -> j1.data : write( k );
        c.data -> j2.data : write( k );
        computeBlocks( j1, j2 );
        j1.blocks -> s1.blocks : write( k2 );
        j2.blocks -> s2.blocks : write( k2 );
        j1 -> c : ok( k );
        j2 -> c : ok( k )
    } else {
        c.data -> j1.data : writeAsync( k );
        c.data -> j2.data : writeAsync( k );
        computeBlocks( j1, j2 );
        j1.data -> s1.data : write( k2 );
        j2.data -> s2.data : write( k2 )
    }
}

^2 http://www.jolie-lang.org/

We briefly comment on the program above, referring the reader to [20] for a more
complete description of Chor. Procedure write implements the behaviour of the
processes from Example 1. The sessions k and k2 (k and k' in Example 1) are
typed by the protocols Write and Store respectively. In the conditional, the client c
checks its internal variable sync to determine whether the write operation should
be synchronous or not. In the first case we proceed as in Example 1. Otherwise,
process c uses a different operation writeAsync to notify the others that it does
not expect a confirmation message at the end.
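How a session can be checked against its protocol is sketched below in Python. This is a simplified illustration under stated assumptions, not the type checker of Chor: the protocol Write is hand-encoded as nested tuples (a hypothetical representation), only the interactions on session k are considered, and the error text merely reuses the wording of the message shown in Fig. 2.

```python
from typing import Optional, Sequence, Tuple

# Assumed encoding: None is the finished protocol; otherwise
# (sender_role, receiver_role, {operation: continuation}) offers a choice of
# operations, mirroring the two branches of protocol Write.
WRITE = ("C", "J1", {
    "write": ("C", "J2", {"write": ("J1", "C", {"ok": ("J2", "C", {"ok": None})})}),
    "writeAsync": ("C", "J2", {"writeAsync": None}),
})

Interaction = Tuple[str, str, str]  # (sender_role, receiver_role, operation)

def check(protocol, interactions: Sequence[Interaction], session: str) -> Optional[str]:
    """Return None if the interactions follow the protocol, else an error message."""
    for sender, receiver, op in interactions:
        if (protocol is None or (sender, receiver) != protocol[:2]
                or op not in protocol[2]):
            return f"Operation {op} is not expected by the type for session {session}"
        protocol = protocol[2][op]  # continue with the chosen branch
    if protocol is not None:
        return f"Session {session} ends before its protocol is complete"
    return None

sync_branch = [("C", "J1", "write"), ("C", "J2", "write"),
               ("J1", "C", "ok"), ("J2", "C", "ok")]
async_branch = [("C", "J1", "writeAsync"), ("C", "J2", "writeAsync")]
print(check(WRITE, sync_branch, "k"))             # None
print(check(WRITE, [("C", "J1", "wrong")], "k"))  # Operation wrong is not expected ...
```

Both branches of procedure write pass the check, while mixing them (for instance, write to J1 followed by writeAsync to J2) is rejected, because the first operation selects the branch that the rest of the session must follow.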

4 Related Work 

The idea of using choreography-like descriptions for communication behaviour
has been used for a long time, for example in software engineering [17],
security [9,7,5], and specification languages for business processes [28,1].

The development of the formal models that we described in § 2 was made
possible by many other previous works on languages for expressing
communication behaviour. The notion of session in CC is a variation of that presented in [4]
for a process calculus. The theory of modular choreographies was inspired by the
article [3], where types for I/O actions are mixed with types for global
communications, and by Multiparty Session Types [16], from which we took the type
language to interface compatible choreographies. Interestingly, combining
multiparty session types with choreographies yields a type inference technique and a
deadlock-freedom analysis that do not require additional machinery as in other
works in the context of processes [4]. The criteria for a correct Endpoint
Projection (EPP) procedure were investigated in many settings, e.g., in [27,6,18,10].

The Chor language and its compiler have already been used as a basis for
implementing other projects. For example, AIOCJ [26] is a choreographic language
supporting the update of executable code at runtime, equipped with a formal
calculus that ensures deadlock-freedom [25]. Choreographies have also been applied
to the design of communication protocols. In particular, Scribble is a
specification language for protocols written from a global viewpoint [29], which can be
used to generate correct-by-construction runtime monitors (see, e.g., [24]).

5 Conclusions and Future Work 

We presented some recent efforts aimed at kickstarting the development of
choreographic programming as a fully-fledged programming paradigm. While the
paradigm holds potential, there is still a lot of work to be done before reaching
a productive real-world programming framework. We describe below some
possible research directions, some of which are planned for in the current research
project behind Chor, the CRC project^3.

^3 http://www.chor-lang.org/

Integration. A key factor for the adoption of choreographic programming will be
interoperability with existing software. Chor can be extended with local
computation primitives that would interact with libraries written in other programming
languages, e.g., Java or Scala, similarly to how it is done in Jolie [21].



Classification. Just like there are many different language models for different 
aspects of concurrent programming, e.g., code mobility and multicast, it should 
be possible to similarly extend choreography models. This suggests a potential 
benefit in having systematic classifications of choreography languages, to observe 
the effect that such extensions have on expressiveness and see how far the
correct-by-construction methodology can be applied.

Exceptions. Introducing exception handling in choreographic programs raises the
issue of coordinating many participants in a global escape (as in [13]), and
whether a suitable strategy can always be found, statically or at runtime.

Formal Implementation. The EPP procedure in CC is based on π-calculus
channels, but its implementation in Chor uses data (protocol headers) to route
messages instead, as in many other enterprise frameworks [2]. To the best of the
author’s knowledge, realising π-calculus channels using data-based message
routing has still to be formally investigated, and the implementation of Chor could
provide an initial stepping stone in such a study.
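The difference between channels and data-based routing can be made concrete with a small Python sketch. It does not reproduce how Chor or Jolie route messages: the header fields `session`, `to` and `operation`, the class `Router` and the endpoint name `p` are all assumptions made for this illustration.

```python
from collections import defaultdict, deque

class Router:
    """Delivers messages to per-(session, receiver) inboxes using header data only."""

    def __init__(self):
        self.inboxes = defaultdict(deque)

    def deliver(self, message: dict) -> None:
        # No dedicated channel exists: the header alone decides the inbox.
        header = message["header"]
        self.inboxes[(header["session"], header["to"])].append(message)

    def receive(self, session: str, receiver: str, operation: str):
        """Return the body of the next message if it carries `operation`, else None."""
        inbox = self.inboxes[(session, receiver)]
        if inbox and inbox[0]["header"]["operation"] == operation:
            return inbox.popleft()["body"]
        return None

# A hypothetical endpoint p takes part in two sessions, k and k2.
router = Router()
router.deliver({"header": {"session": "k", "to": "p", "operation": "write"}, "body": "first"})
router.deliver({"header": {"session": "k2", "to": "p", "operation": "ack"}, "body": "second"})
print(router.receive("k2", "p", "ack"))   # second
print(router.receive("k", "p", "write"))  # first
```

Messages of different sessions never interfere here because the session identifier is part of the inbox key, which is the role that a separate channel per session plays in the π-calculus.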

Acknowledgements. The author was supported by the Danish Council for
Independent Research project Choreographies for Reliable and efficient
Communication software (CRC), grant no. DFF-4005-00304, and by the EU COST Action
IC1201 Behavioural Types for Reliable Large-Scale Software Systems (BETTY).